Results 1 - 4 of 4
1.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) ; 13610 LNCS:23-32, 2022.
Article in English | Scopus | ID: covidwho-2173854

ABSTRACT

Biomedical named entity recognition is becoming increasingly important to biomedical research due to a proliferation of articles and also due to the current pandemic disease. This paper addresses the task of automatically finding and recognizing biomedical entity types related to COVID (e.g., virus, cell, therapeutic) with tolerance rough sets. The task includes (i) extracting nouns and their co-occurring contextual patterns from a large BioNER dataset related to COVID-19 and (ii) annotating unlabelled data with a semi-supervised learning algorithm using co-occurrence statistics. 465,250 noun phrases and 6,222,196 contextual patterns were extracted from 29,500 articles using natural language text processing methods. Three categories were successfully classified at this time: virus, cell and therapeutic. Early precision@N results demonstrate that our proposed tolerant pattern learner (TPL) is able to constrain concept drift in all 3 categories during the iterative learning process. © 2022, Springer-Verlag GmbH Germany, part of Springer Nature.
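As a rough illustration of the two ingredients the abstract names, co-occurrence statistics between nouns and contextual patterns and precision@N scoring, here is a minimal sketch. It is not the paper's tolerant pattern learner: the whitespace tokenizer, the one-word left/right context pattern, and the function names `cooccurrence_counts` and `precision_at_n` are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def cooccurrence_counts(sentences, nouns):
    """Count how often each known noun appears inside each context pattern.

    Assumption: a "contextual pattern" is simply the word to the left and
    the word to the right of the noun, with "_" marking the noun slot.
    """
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok not in nouns:
                continue
            left = tokens[i - 1] if i > 0 else "<s>"
            right = tokens[i + 1] if i + 1 < len(tokens) else "</s>"
            counts[tok][f"{left} _ {right}"] += 1
    return counts


def precision_at_n(ranked, relevant, n):
    """Fraction of the top-n ranked candidates that are in the relevant set."""
    top = ranked[:n]
    if not top:
        return 0.0
    return sum(1 for item in top if item in relevant) / len(top)
```

In a semi-supervised loop of the kind described, counts like these would be used to score unlabelled nouns against seed examples for a category, and precision@N would be checked on the top-ranked candidates after each iteration to watch for concept drift.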

2.
International Journal of Advanced Computer Science and Applications ; 13(5), 2022.
Article in English | ProQuest Central | ID: covidwho-1924733

ABSTRACT

The analysis of unstructured text has become a challenge for the community dedicated to natural language processing (NLP) and machine learning (ML). This paper aims to describe the potential of the most used NLP techniques and ML algorithms to address various problems afflicting our society. Several original articles published in SCOPUS during 2021 were reviewed. The applied approach was retrospective, transversal and descriptive. The data collected were entered into the SPSS statistical software v25. Among the findings, the most used NLP technique was term frequency-inverse document frequency (TF-IDF), while the most used supervised learning algorithm was the support vector machine (SVM). Likewise, the predominant deep learning algorithm was long short-term memory (LSTM). This research aims to help experts and those starting in research to identify the most used NLP and ML algorithms.
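For readers unfamiliar with TF-IDF, the technique this review found most used, the following minimal sketch shows one common textbook variant: term frequency normalized by document length, multiplied by log(N / df). The function name `tf_idf` and the whitespace tokenizer are assumptions for illustration; library implementations typically add smoothing and normalization, so their numbers will differ.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    tf  = term count / number of tokens in the document
    idf = log(number of documents / number of documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens)
        weights.append(
            {term: (count / total) * math.log(n_docs / df[term])
             for term, count in tf.items()}
        )
    return weights
```

A term that appears in every document gets weight zero under this variant, which is why TF-IDF vectors emphasize terms that distinguish one document from the rest before they are passed to a classifier such as an SVM.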

3.
2021 IEEE Cloud Summit, Cloud Summit 2021 ; : 7-12, 2021.
Article in English | Scopus | ID: covidwho-1707014

ABSTRACT

The coronavirus COVID-19 pandemic has become the center of concern worldwide and hence the focus of media attention. Checking coronavirus-related news and updates has become a daily routine for everyone. Hence, news processing and analytics have become key to harvesting the real value of this massive amount of news. The growing volume of published news about COVID-19 makes it hard for a variety of audiences to navigate through, analyze, and select the most important news (e.g., relevant information about the pandemic, its evolution, the vital precautions, and the necessary interventions). This can be realized using current and emerging technologies including Cloud computing, Artificial Intelligence (AI) and Deep Learning (DL). In this paper, we propose a framework to analyze the massive amount of public COVID-19 media reports over the Cloud. This framework encompasses four modules, including text preprocessing, deep learning, and machine learning-based news information extraction, and recommendation. We conducted experiments to evaluate three modules of our framework, and the results we obtained show that combining information derived from the news reports provides policymakers, health authorities, and the public with a complete picture of the way this virus is proliferating. Analyzing this data swiftly is a powerful tool to provide imperative answers to questions that are relevant to public health. © 2021 IEEE.

4.
JMIR Med Inform ; 9(2): e25457, 2021 Feb 10.
Article in English | MEDLINE | ID: covidwho-1032549

ABSTRACT

BACKGROUND: Medical notes are a rich source of patient data; however, the nature of unstructured text has largely precluded the use of these data for large retrospective analyses. Transforming clinical text into structured data can enable large-scale research studies with electronic health records (EHR) data. Natural language processing (NLP) can be used for text information retrieval, reducing the need for labor-intensive chart review. Here we present an application of NLP to large-scale analysis of medical records at 2 large hospitals for patients hospitalized with COVID-19.

OBJECTIVE: Our study goal was to develop an NLP pipeline to classify the discharge disposition (home, inpatient rehabilitation, skilled nursing inpatient facility [SNIF], and death) of patients hospitalized with COVID-19 based on hospital discharge summary notes.

METHODS: Text mining and feature engineering were applied to unstructured text from hospital discharge summaries. The study included patients with COVID-19 discharged from 2 hospitals in the Boston, Massachusetts area (Massachusetts General Hospital and Brigham and Women's Hospital) between March 10, 2020, and June 30, 2020. The data were divided into a training set (70%) and hold-out test set (30%). Discharge summaries were represented as bags-of-words consisting of single words (unigrams), bigrams, and trigrams. The number of features was reduced during training by excluding n-grams that occurred in fewer than 10% of discharge summaries, and further reduced using least absolute shrinkage and selection operator (LASSO) regularization while training a multiclass logistic regression model. Model performance was evaluated using the hold-out test set.

RESULTS: The study cohort included 1737 adult patients (median age 61 [SD 18] years; 55% men; 45% White and 16% Black; 14% nonsurvivors and 61% discharged home). The model selected 179 from a vocabulary of 1056 engineered features, consisting of combinations of unigrams, bigrams, and trigrams. The top features contributing most to the classification by the model (for each outcome) were the following: "appointments specialty," "home health," and "home care" (home); "intubate" and "ARDS" (inpatient rehabilitation); "service" (SNIF); "brief assessment" and "covid" (death). The model achieved a micro-average area under the receiver operating characteristic curve value of 0.98 (95% CI 0.97-0.98) and average precision of 0.81 (95% CI 0.75-0.84) in the testing set for prediction of discharge disposition.

CONCLUSIONS: A supervised learning-based NLP approach is able to classify the discharge disposition of patients hospitalized with COVID-19. This approach has the potential to accelerate and increase the scale of research on patients' discharge disposition that is possible with EHR data.
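The feature-engineering steps in the METHODS section (70/30 split, bags of unigrams, bigrams, and trigrams, and dropping n-grams seen in fewer than 10% of summaries) can be sketched roughly as below. This is an illustrative reconstruction, not the authors' code: the whitespace tokenizer, the fixed random seed, and every function name are assumptions, and the LASSO-regularized multiclass logistic regression step is deliberately left out.

```python
import random
from collections import Counter


def ngrams(tokens, max_n=3):
    """Yield unigrams, bigrams, and trigrams as space-joined strings."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def build_vocabulary(train_docs, min_df=0.10):
    """Keep n-grams that occur in at least min_df of the training documents."""
    df = Counter()
    for doc in train_docs:
        # set() so each document counts an n-gram at most once
        df.update(set(ngrams(doc.lower().split())))
    threshold = min_df * len(train_docs)
    return sorted(gram for gram, count in df.items() if count >= threshold)


def bag_of_ngrams(doc, vocabulary):
    """Represent one document as n-gram counts aligned with the vocabulary."""
    counts = Counter(ngrams(doc.lower().split()))
    return [counts[gram] for gram in vocabulary]


def split_train_test(items, train_fraction=0.7, seed=0):
    """Shuffle deterministically, then split into training and hold-out sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]
```

Building the vocabulary from the training set only, as the abstract describes ("reduced during training"), keeps information from the hold-out summaries out of feature selection. The resulting count vectors would then be passed to an L1-regularized classifier, which is what shrinks the feature set further.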
